Hierarchical Relative Entropy Policy Search

Authors

Christian Daniel

Gerhard Neumann

Jan Peters

Abstract

Many reinforcement learning (RL) tasks, especially in robotics, consist of multiple sub-tasks that are strongly structured. Such task structures can be exploited by incorporating hierarchical policies that consist of gating networks and sub-policies. However, this concept has only been partially explored for real-world settings, and complete methods derived from first principles are needed. Real-world settings are challenging because their large, continuous state-action spaces are prohibitive for exhaustive sampling methods. We define the problem of learning sub-policies in continuous state-action spaces as finding a hierarchical policy composed of a high-level gating policy that selects the low-level sub-policies for execution by the agent. In order to share experience efficiently among all sub-policies, also called inter-policy learning, we treat these sub-policies as latent variables, which allows the update information to be distributed among the sub-policies. We present three variants of our algorithm, designed to suit a wide variety of real-world robot learning tasks, and evaluate them in two real robot learning scenarios as well as in several simulations and comparisons.
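The hierarchical structure described in the abstract, a gating policy that selects among sub-policies, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the state and action are scalars, the gating policy is a softmax over linear scores, each sub-policy is linear-Gaussian, and the class and method names are hypothetical. The `responsibilities` method shows one common way to treat the sub-policy index as a latent variable: it computes the posterior probability that each sub-policy produced an observed action, so a single sample can contribute to every sub-policy in proportion to that probability.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


class HierarchicalPolicy:
    """Illustrative mixture policy: pi(a|s) = sum_o pi(o|s) * pi(a|s,o).

    Assumed parameterization (not from the paper):
      gating_weights[o] = (w, b)      -> score of sub-policy o is w * s + b
      sub_policies[o]   = (k, c, std) -> action ~ Normal(k * s + c, std^2)
    """

    def __init__(self, gating_weights, sub_policies):
        assert len(gating_weights) == len(sub_policies)
        self.gating_weights = gating_weights
        self.sub_policies = sub_policies

    def gating_probs(self, s):
        """High-level gating policy pi(o|s)."""
        return softmax([w * s + b for w, b in self.gating_weights])

    def sample(self, s, rng):
        """Pick a sub-policy with the gating policy, then draw its action."""
        probs = self.gating_probs(s)
        o = rng.choices(range(len(probs)), weights=probs)[0]
        k, c, std = self.sub_policies[o]
        return o, rng.gauss(k * s + c, std)

    def responsibilities(self, s, a):
        """Posterior p(o|s,a) over the latent sub-policy index.

        Sketch only: an action far from every sub-policy mean could
        underflow to a zero total, which this code does not guard against.
        """
        probs = self.gating_probs(s)
        joint = [
            p * gaussian_pdf(a, k * s + c, std)
            for p, (k, c, std) in zip(probs, self.sub_policies)
        ]
        total = sum(joint)
        return [j / total for j in joint]


# Two sub-policies with constant means -1 and +1 (hypothetical values).
policy = HierarchicalPolicy(
    gating_weights=[(1.0, 0.0), (-1.0, 0.0)],
    sub_policies=[(0.0, -1.0, 0.5), (0.0, 1.0, 0.5)],
)
rng = random.Random(0)
option, action = policy.sample(0.0, rng)
resp = policy.responsibilities(0.0, -1.0)
```

In this toy setup an action observed at -1.0 is attributed almost entirely to the first sub-policy, while an action midway between the two means would be shared equally between them.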

Similar resources

Online learning in episodic Markovian decision processes by relative entropy policy search

We study the problem of online learning in finite episodic Markov decision processes (MDPs) where the loss function is allowed to change between episodes. The natural performance measure in this learning problem is the regret defined as the difference between the total loss of the best stationary policy and the total loss suffered by the learner. We assume that the learner is given access to a ...

Learning to Serve and Bounce a Ball

In this paper we investigate learning the tasks of ball serving and ball bouncing. These tasks display characteristics which are common in a variety of motor skills. To learn the required motor skills for these tasks the robot uses Relative Entropy Policy Search which is a state of the art method in Policy Search Reinforcement Learning. Our experiments show that REPS does not only converge cons...

Twenty Questions for Localizing Multiple Objects by Counting: Bayes Optimal Policies for Entropy Loss

We consider the problem of twenty questions with noiseless answers, in which we aim to locate multiple objects by querying the number of objects in each of a sequence of chosen sets. We assume a joint Bayesian prior density on the locations of the objects and seek to choose the sets queried to minimize the expected entropy of the Bayesian posterior distribution after a fixed number of questions...

A Hierarchical Approach in Multilevel Thresholding Based on Maximum Entropy and Bayes' Formula

An efficient hierarchical approach for image multi-level thresholding is proposed based on the maximum entropy principle and Bayes’ formula, in which no assumptions of the image histogram are made. Five forms of conditional probability distributions are employed for optimal threshold determination. Our experiments demonstrate that the proposed method is effective and achieves a significant impr...

A Relative Entropy Approach to Constructing Hierarchical Summaries

Hierarchies provide a means of organizing, summarizing and accessing information. This paper describes a method for automatically generating hierarchies from small collections of text. A formal framework is presented which uses relative entropy to identify words that are both topical and predictive of the vocabulary used to discuss the topics in the collection. These two features lead to the cr...

Journal:

Journal of Machine Learning Research

Volume 17 (issue not given)

Pages: not given

Publication date: 2012
